Stochastic Modified Equations and Adaptive Stochastic Gradient Algorithms

Authors

  • Qianxiao Li
  • Cheng Tai
  • Weinan E
Abstract

We develop the method of stochastic modified equations (SME), in which stochastic gradient algorithms are approximated in the weak sense by continuous-time stochastic differential equations. We exploit the continuous formulation together with optimal control theory to derive novel adaptive hyper-parameter adjustment policies. Our algorithms have competitive performance with the added benefit of being robust to varying models and datasets. This provides a general methodology for the analysis and design of stochastic gradient algorithms.
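For orientation, here is a sketch of the kind of weak approximation the SME framework rests on. The exact statement, order of accuracy, and coefficients are developed in the paper; the form below is illustrative and assumes the simplest SGD setting with learning rate η.

% SGD with learning rate \eta on randomly sampled losses f_{\gamma_k}:
\[
  x_{k+1} = x_k - \eta\, \nabla f_{\gamma_k}(x_k),
  \qquad
  f(x) = \mathbb{E}_{\gamma}\, f_{\gamma}(x).
\]
% Its stochastic modified equation: an Ito SDE whose drift is the full gradient
% and whose noise scales with \sqrt{\eta} and the gradient covariance \Sigma(x),
\[
  dX_t = -\nabla f(X_t)\, dt + \sqrt{\eta}\, \Sigma(X_t)^{1/2}\, dW_t,
  \qquad
  \Sigma(x) = \mathrm{Cov}_{\gamma}\!\left[\nabla f_{\gamma}(x)\right],
\]
% which matches the SGD iterates in the weak sense (in distribution, over time
% scales t \approx k\eta). Adaptive hyper-parameter policies are then obtained
% by posing an optimal control problem on such an SDE.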


Related articles

Multichannel recursive-least-square algorithms and fast-transversal-filter algorithms for active noise control and sound reproduction systems

In the last ten years, there has been much research on active noise control (ANC) systems and transaural sound reproduction (TSR) systems. In those fields, multichannel FIR adaptive filters are extensively used. For the learning of FIR adaptive filters, recursive-least-squares (RLS) algorithms are known to produce a faster convergence speed than stochastic gradient descent techniques, such as t...

Full text
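As a point of reference for the RLS-versus-stochastic-gradient comparison above, here is a minimal single-channel RLS sketch; the paper concerns multichannel and fast-transversal-filter variants, which this does not attempt to reproduce, and the filter length, forgetting factor, and initialization below are illustrative choices.

import numpy as np

def rls_filter(x, d, n_taps=16, lam=0.99, delta=1e2):
    """Single-channel recursive-least-squares (RLS) adaptive FIR filter sketch.

    x: input signal, d: desired signal, lam: forgetting factor,
    delta: initial scale of the inverse correlation matrix estimate.
    Returns the a priori error signal and the final tap weights.
    """
    w = np.zeros(n_taps)                   # FIR tap weights
    P = np.eye(n_taps) * delta             # inverse correlation matrix estimate
    e = np.zeros(len(x))
    for n in range(n_taps - 1, len(x)):
        u = x[n - n_taps + 1:n + 1][::-1]  # regression vector, newest sample first
        e[n] = d[n] - w @ u                # a priori error
        Pu = P @ u
        k = Pu / (lam + u @ Pu)            # gain vector
        w = w + k * e[n]                   # weight update
        P = (P - np.outer(k, Pu)) / lam    # update inverse correlation estimate
    return e, w

# Illustrative use: identify a hypothetical unknown FIR system from noisy data.
rng = np.random.default_rng(0)
x = rng.standard_normal(4000)
h = 0.5 * rng.standard_normal(16)          # hypothetical unknown system
d = np.convolve(x, h)[:len(x)] + 0.01 * rng.standard_normal(len(x))
e, w = rls_filter(x, d)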

Novel Stochastic Gradient Adaptive Algorithm with Variable Length

The goal of this paper is to present a novel variable length LMS (Least Mean Square) algorithm, in which the length of the adaptive filter is always a power of two and it is modified using an error estimate. Unlike former variable length stochastic gradient adaptive techniques, the proposed algorithm works in non-stationary situations. The implementation of the adaptive filter is described and ...

Full text
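The abstract only outlines the idea, so the sketch below is a guess at its shape rather than the paper's algorithm: a standard LMS update whose filter length is doubled or halved (staying a power of two) when an error-based criterion crosses fixed thresholds. The criterion, window length, and threshold values here are all assumptions.

import numpy as np

def variable_length_lms(x, d, mu=0.01, n0=8, n_max=64, window=500,
                        grow_thresh=0.5, shrink_thresh=0.05):
    """LMS filter whose length stays a power of two, adjusted by an error estimate.

    Illustrative only: the adjustment rule (windowed mean squared error compared
    against fixed fractions of the desired-signal power) is an assumption, not
    the rule proposed in the paper.
    """
    n_taps = n0
    w = np.zeros(n_taps)
    e = np.zeros(len(x))
    err_acc, pow_acc = 0.0, 0.0
    for n in range(n_max, len(x)):
        u = x[n - n_taps + 1:n + 1][::-1]       # regression vector, newest first
        e[n] = d[n] - w @ u
        w = w + mu * e[n] * u                   # standard LMS update
        err_acc += e[n] ** 2
        pow_acc += d[n] ** 2
        if (n + 1) % window == 0:               # periodically revisit the length
            ratio = err_acc / max(pow_acc, 1e-12)
            if ratio > grow_thresh and n_taps < n_max:
                n_taps *= 2                     # double: append zero-valued taps
                w = np.concatenate([w, np.zeros(n_taps // 2)])
            elif ratio < shrink_thresh and n_taps > n0:
                n_taps //= 2                    # halve: drop the trailing taps
                w = w[:n_taps]
            err_acc, pow_acc = 0.0, 0.0
    return e, w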

Stochastic modified equations and the dynamics of stochastic gradient algorithms (Appendix A: Modified equations in the numerical analysis of PDEs)

where u : [0, T] × [0, L] → R represents a density of some material in [0, L] and c > 0 is the transport velocity. It is well-known that simple forward-time central-space differencing leads to instability for all discretization step-sizes (LeVeque, 2002). Instead, more sophisticated differencing schemes must be used. We set time and space discretization steps to ∆t and ∆x and denote u(n∆t, ...

Full text
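The excerpt begins mid-sentence; the equation it refers to is presumably the one-dimensional linear transport (advection) equation, and the unstable scheme is forward-time central-space (FTCS) differencing. A sketch under that assumption:

% One-dimensional linear advection of a density u with velocity c > 0:
\[
  \frac{\partial u}{\partial t} + c\, \frac{\partial u}{\partial x} = 0,
  \qquad
  u : [0,T] \times [0,L] \to \mathbb{R}.
\]
% Forward-time central-space (FTCS) discretization with steps \Delta t, \Delta x,
% writing u_j^n \approx u(n\Delta t, j\Delta x):
\[
  \frac{u_j^{n+1} - u_j^{n}}{\Delta t}
  + c\, \frac{u_{j+1}^{n} - u_{j-1}^{n}}{2\Delta x} = 0.
\]
% A von Neumann analysis gives an amplification factor of modulus greater than
% one for generic Fourier modes, whatever \Delta t and \Delta x, which is the
% instability cited above (LeVeque, 2002). Modified-equation analysis explains
% it: the scheme effectively solves an advection-diffusion equation with a
% negative diffusion coefficient.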

A Unified Approach to Adaptive Regularization in Online and Stochastic Optimization

We describe a framework for deriving and analyzing online optimization algorithms that incorporate adaptive, data-dependent regularization, also termed preconditioning. Such algorithms have been proven useful in stochastic optimization by reshaping the gradients according to the geometry of the data. Our framework captures and unifies much of the existing literature on adaptive online methods, ...

Full text
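For readers unfamiliar with data-dependent preconditioning, one standard instance of the kind of update such a framework covers (a well-known example, not taken from this paper) is the AdaGrad-style diagonally preconditioned step:

% Generic preconditioned (adaptively regularized) update on stochastic gradients g_t:
\[
  x_{t+1} = x_t - \eta\, A_t^{-1} g_t,
\]
% where the preconditioner A_t is built from the data seen so far. A common
% diagonal choice (AdaGrad) accumulates squared gradient coordinates,
\[
  A_t = \mathrm{diag}\!\Big(\epsilon + \sqrt{\textstyle\sum_{s \le t} g_s \odot g_s}\Big),
\]
% so coordinates with persistently large gradients receive smaller effective
% learning rates, i.e. the gradients are reshaped according to the geometry of
% the data.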

ADASECANT: Robust Adaptive Secant Method for Stochastic Gradient

Stochastic gradient algorithms have been the main focus of large-scale learning problems, and they have led to important successes in machine learning. The convergence of SGD depends on the careful choice of learning rate and the amount of noise in stochastic estimates of the gradients. In this paper, we propose a new adaptive learning rate algorithm, which utilizes curvature information for auto...

Full text
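As an illustration of what "curvature information" via a secant approximation can look like, here is a generic diagonal secant step-size rule; it is not the ADASECANT update itself, only a sketch of the underlying idea.

% Per-coordinate secant (finite-difference) curvature estimate from successive
% stochastic gradients g_t and iterates x_t:
\[
  h_{t,i} \approx \frac{g_{t,i} - g_{t-1,i}}{x_{t,i} - x_{t-1,i}},
  \qquad
  x_{t+1,i} = x_{t,i} - \frac{\eta}{\lvert h_{t,i}\rvert + \epsilon}\, g_{t,i},
\]
% so coordinates with high estimated curvature take smaller steps. In practice
% the noisy estimates h_{t,i} must be smoothed and safeguarded, which is exactly
% the robustness concern an adaptive method has to address.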


Journal:

Volume   Issue

Pages  -

Publication date: 2017